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Abstract 



We present an exact local bosonic algorithm for the simulation of dynamical fermions 
in lattice QCD. It is based on a non-hermitian polynomial approximation of the inverse 
of the quark matrix and a global Metropolis accept /reject correction of the systematic 
errors. We show that this algorithm is a real alternative to the Hybrid Monte Carlo 
C*~) ' algorithm. 

o ' 

! 1 Introduction 



c3 . 

The search for more efficient full QCD algorithms has motivated substantial activity, both 
Q_i! within the classical Hybrid Monte Carlo (HMC) and in the alternative method of the local 

bosonic algorithm. This last algorithm was proposed by Liischer in a hermitian version 
Q. The idea was to approximate the full QCD partition function using a local bosonic 
action based on a polynomial approximation of the inverse of the squared Wilson fermion 
matrix. The aim was to obtain an algorithm which does not need the explicit inversion of 
the fermion matrix and which only uses local, finite-step-size updates, contrary to HMC. 

In this work we study several significant improvements of this algorithm: an inexpensive 
stochastic Metropolis accept /reject test Q to make the algorithm exact; a non-hermitian 
polynomial approximation |^] and a simple even-odd preconditioning [||, ||]. Here we con- 
sider two-flavor QCD, and we present a description of the algorithm. We illustrate the 
effectiveness of our algorithm with representative Monte Carlo results. We successfully 
model its static properties (the Monte Carlo acceptance), and try to disentangle its dynam- 
ics. 

From our analysis it appears that this version of the local bosonic algorithm is a real 
alternative to the classic HMC. For heavy quarks its finite-step dynamics are comparable to 
quenched Monte Carlo, and are considerably faster than HMC. And for large volumes and 
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small quark masses, the scaling of our algorithm compares favorably with that of HMC. 
Thus our algorithm becomes attractive for light quarks on large lattices as well. Finally, 
because it uses local updating techniques, it is not affected by the accumulation of roundoff 
errors which can marr the reversibility of HMC in such cases. 



2 Description of the algorithm 

The full QCD partition function with two fermion flavors is given by 

Z = J[dU]\detD\ 2 e~ Sc[u] (1) 

where D represents the fermion matrix and Sq denotes the pure gauge action. This partition 
function can be approximated introducing a polynomial approximation of the inverse of the 
fermion matrix. 

i detD i 2 = detD,detD "idsnW < 2 > 

The polynomial P(z) = Ylk=i( z ~ z k) of degree n is defined in the complex plane and 
approximates the inverse of z. Since we are investigating full QCD with two flavors, the 
determinant \ det P(D)\ 2 manifestly factorizes into positive pairs, so that the rg^TTDTp 
term of the approximation (|2|) can be expressed by a Gaussian integral over a set of boson 
fields 4>k (k = 1, ...,n) with color and Dirac indices. The full QCD partition function (|l|) is 
then approximated by 

Z ~ J [dU][d<t>][d^] e ~ SL[u ^ ] (3) 

where <f> represents the set of all boson field families, Sl is the local action Sl = Sq + Sb 
and Sb = J2k=i \ iP ~ z k)4>k\ 2 - Making use of the locality of Sl we may now simulate 
the partition function (||) by locally updating the boson fields and the gauge fields, using 
heat-bath and over-relaxation algorithms. 



The simulation of full QCD can be obtained from (0) by correcting the errors due to 
the approximation through a Metropolis test at the end of each trajectory. Introducing the 
error term in the partition function we obtain 

Z= f[dU}[d^][d^]\det(DP(D))\ 2 e- s ^ U '^ (4) 
The correction term | det(DP(D))\ 2 can be evaluated in two different ways. 

The first one consists in estimating the determinant | det(DP(D))\ 2 using a noisy esti- 
mator Q. The strategy of this method is to update the (U, cj>) fields such that the probability 
of finding a particular configuration is proportional to e~ Sh and then perform a Metropolis 
test defining an acceptance probability P A in terms of the noisy estimation of the correction 
| det (-DP(-D))| 2 . The acceptance probability P A is given by 

ni^(^)= min ( i ' eHWx|2+lx|2 ) ( 5 ) 
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where W = [D'P(D')] 1 DP(D). In this case the algorithm satisfies detailed balance after 
averaging over the Gaussian noise x- 

The second method consists in expressing the correction term | det(DP(D))\ 2 directly 
by a Gaussian integral and incorporating the dynamics of the correction field rj in the 
partition sum (§) 

Z= f[dU][dri][dr]^[d(()][d^}e- Se ^ u '^ (6) 



by defining a new exact action 

Sexact(U, 0, 7?) = S L (U, <f>) + S C {U, 7?) (7) 

where S G (U,rf) = \[DP(D)]~ 1 rj\ 2 is the correction action. In carrying out the simulation 
one generates configurations of (U, 4>) and rj such that the probability of finding a particular 
configuration is proportional to exp(— S exac t). Also in this case the strategy is to alterna- 
tively update the (U, (f>) fields and the r\ fields. The Metropolis acceptance probability P A 
is not defined using a noisy estimation of the correction | det(DP(D))\ 2 , but directly using 
the exact action S exac t so that the transition probability of the algorithm satisfies detailed 
balance without the need to average over r\. In this case the acceptance probability is again 
defined in terms of a Gaussian vector by 

min (l, e-l^'+W 2 ) if S G {U) > S G (U') 



P (U,ct>)^(U',<i>>) - { . 4-IM/-lv|2_M2\ . r „ /rrN . „ srrrs ( 8 ) 



mm 



1 5 e +\w^ x \ 2 -\x\ 2 ^ if Sg (U) < S G {U') 



We have tested both methods and both seem equally efficient. In Q we have presented 
a formal proof that both algorithms converge to the right distribution. 



The algorithm can be summarized as follows: 

• Generate a new Gaussian spinor x with variance one. 

• Update locally the boson and gauge fields m times (in reversible order) according to 
the approximate partition function (||). 

• Accept /reject the new configuration according to the Metropolis acceptance proba- 
bility (|5|) for the noisy version or (||) for the non-noisy version. 

In order to evaluate the Metropolis acceptance probability we have to solve a linear 
system involving DP(D) or D'P(D'), for which we use the BiCGstab algorithm Q. This 
linear system is very well conditioned because P(D') (or P(D)) approximates the inverse of 
D' (or D). The cost for solving it is minimal and scales like the local updating algorithms 
in the volume and quark mass. 

We emphasize that our algorithm remains exact for any choice of the polynomial P. 
If the polynomial approximates the inverse of the fermion matrix well, the acceptance of 
the Metropolis correction will be high; if not the acceptance will be low. The number and 
location of the roots in the complex plane determine the quality of the approximation and 
hence the acceptance. Since the algorithm is exact for any polynomial, a priori knowledge 
about the spectrum of the fermion matrix is not required. 
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3 Results for the exact local bosonic algorithm. 



Numerical simulations using the exact local bosonic algorithm in the non-noisy version de- 
scribed in the previous section have been performed for different lattice parameters. The 
majority of the simulations are reported in 0. 

We explored the acceptance of the Metropolis correction test and the number of it- 
erations of the BiCGStab algorithm used in that test for inverting DP{D). The study 
was performed by varying the degree n of the polynomial and the hopping parameter k. 
One observes that the acceptance increases quite rapidly with the degree of the polynomial, 
above some threshold (see Figs. 1,2,3). On the other hand, the number of iterations needed 
to invert DP(D) remains very low for high enough acceptance. The data show clearly that 
the overhead due to the Metropolis test remains negligible provided that the degree of the 
polynomial is tuned to have sufficient acceptance. Results obtained from simulations using 
even-odd preconditioning confirm, as expected, that the improvement of the approximation 
reduces the required number of bosonic fields by at least a factor two. 



4 Predicting the Metropolis acceptance 

Using some general assumptions and the known error bounds on the Chebyshev-like ap- 
proximation of D~ , we can obtain [Q] an ansatz for the acceptance probability, which at 
(3 = takes the form 

(/ K \ n+1 \ 
ZvWf-j J (9) 

where V is the lattice volume, K is the hopping parameter of the Wilson fermion matrix 
D and K c is the critical hopping parameter. We expect the fitting parameter / to depend 
smoothly on f3, but very little on n, V and K. Figs. 1, 2 and 3 show the acceptance we 
measured during our Monte- Carlo simulations at (5 = 0, as a function of n, for 2 different 
volumes and 2 different .fT's. The fit (eq. @) is shown by the dotted lines. All 3 figures 
have been obtained with the same value / = 0.19. We thus consider our ansatz (|9|) quite 
satisfactory. At other values of /3, one can fix / by a test on a small lattice, and then predict 
the acceptance for larger volumes and different quark masses. We will use our ansatz below 
to analyze the cost of our algorithm. 



5 Understanding the dynamics 

The coupled dynamics of the gauge and bosonic fields in Liischer's method are rather sub- 
tle. The long autocorrelation times rocn were explained in ||, ^| for the local algorithm 
without the global Metropolis test. 

Since we now have a reasonable understanding of the Metropolis acceptance and of the 
dynamics without the Metropolis test, we can see the effect of the one on the other, and 
then estimate the total cost of the algorithm per independent configuration. Calling r and 
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tq the autocorrelation time with and without Metropolis respectively, we obtain 



1 



-1 



(10) 



r = 



log(l- < Pace > (1 " e- m /T0)) 



for a trajectory of m sweeps. Note that to is measured in sweeps and r in trajectories. 
Folding into ([H]) our ansatz for < P acc > (eq. ([])), and measuring the proportionality 
constant C for the autocorrelation without the Metropolis test tq ~ Cn, we obtain the 
autocorrelation time as a function of n. An example is plotted in Fig. 4 for lattices of sizes 
4, 8, 16, 32, at (5 = and K = 0.215, with m = 10 (for this lattice parameters C ~ 1.3). 
The behavior will be qualitatively similar for other choices of parameters. The Monte Carlo 
data shown in Fig. 4 was obtained on a 4 4 lattice, and is roughly compatible with eq. (|l0|) . 

The operation count of the algorithm is about 6n matrix-vector multiplications by the 
Dirac operator D per sweep. Therefore we can estimate the total cost of our algorithm to 
produce an independent configuration, measured in multiplications by D per lattice site, 
as a function of the number of fields or the Metropolis acceptance. As expected, one can 
observe that the optimal number of bosonic fields grows logarithmically with the volume; 
the acceptance must be kept high, with a fairly broad optimum around 70 — 80%. 

6 Scaling 

The scaling of the number of bosonic fields and of the total complexity with the volume 
and the quark mass has already been discussed in ||, ||. Our analysis confirms these 
earlier estimates. 

• As the volume V increases, the number n of bosonic fields should grow like logV . This 
is a consequence of the exponential convergence of the polynomial approximation. Since 
the work per sweep is proportional to nV, and the autocorrelation time to n, the work per 
independent configuration grows like V(logV) 2 . This is an asymptotically slower growth 
than Hybrid Monte Carlo which requires work ~ V 5 ^. But this is more of an academic 
than a practical advantage. 

• As the quark mass m q decreases, the number of fields n must grow like m~ l . This can 
be derived from (||), using m q oc 1 — K/K c , which applies for small quark masses. The work 
per sweep is proportional to n. The autocorrelation time behavior is less clear. We expect 
that the autocorrelation behaves like r ~ nm~ a , since m q enters in the effective mass term 
of each harmonic piece of the action Sb, and since a factor n comes from the autocorrelation 
of the gauge fields alone. There is no theoretical understanding of the coupling between the 
gauge and boson fields, so to determine a our main argument comes from MC data. From 
an exploratory simulation Q on a 4 4 lattice at f3 = we obtained that a is near 1. This is 
only indicative, because the dependence of a with the volume and the coupling has yet to 
be explored. 
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7 Conclusion 



We have presented an alternative algorithm to HMC for simulating dynamical quarks in 
lattice QCD. This algorithm is based on a local bosonic action. A non-hermitian polynomial 
approximation of the inverse of the quark matrix is used to define the local bosonic action. 
The addition of a global Metropolis test corrects the systematic errors. The overhead of 
the correction test is minimal. Even-odd preconditioning is very simple to implement. It 
reduces the number of required bosonic fields by at least a factor two, and accelerates the 
dynamics by the same factor. 

This algorithm is exact for any choice of the polynomial approximation. No critical 
tuning of the approximation parameters is needed. Only the efficiency of the algorithm, 
which can be monitored, will be affected by the choice of parameters. The cost of the 
algorithm increases with the volume V of the lattice as T^(log V) 2 and with the inverse of 
the quark mass m„ as ~ h a with a ~ 1. This compares favorably with the scaling of 

m q m q 

HMC. 

Finally, for heavy quarks the dynamic properties of our algorithm approach those of 
quenched Monte Carlo, and are considerably faster than HMC in that regime [j/J. We thus 
have presented an algorithm superior to HMC in the limits of heavy and light dynamical 
quarks, and we expect it to be competitive in the intermediate regime. 
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Figure 1: Metropolis acceptance as a function of the number of bosonic fields. The dotted 
line is our 1-parameter ansatz eq.(9). 
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Figure 2: Same as Fig.l, for a different value of K. The fit parameter / (eq.(9)) is 
unchanged from Fig.l. 
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Figure 3: Same as Fig.l, for a different volume. The fit parameter / (eq.(9)) is unchanged 
from Fig.l. 
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Figure 4: Integrated autocorrelation time given by our ansatz (10) as a function of the 
number of fields, measured in trajectories, for lattices of size L = 4, 8, 16, 32 from left to 
right. The Monte Carlo results also shown have been obtained for L = 4. 
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